The mitochondrial genome of Dendrobaena tellermanica Perel, 1966 (Annelida: Lumbricidae) and its phylogenetic position

Earthworms are an important ecological group that has a significant impact on soil fauna as well as plant communities. Despite their importance, genetic diversity and phylogeny of earthworms are still insufficiently studied. Most studies on earthworm genetic diversity are currently based on a few mitochondrial and nuclear genes. Mitochondrial genomes are becoming a promising target for phylogeny reconstruction in earthworms. However, most studies on earthworm mitochondrial genomes were made on West European and East Asian species, with much less sampling from other regions. In this study, we performed sequencing, assembly, and analysis of the mitochondrial genome of Dendrobaena tellermanica Perel, 1966 from the Northern Caucasus. This species was earlier included into D. schmidti (Michaelsen, 1907), a polytypic species with many subspecies. The genome was assembled as a single contig 15,298 bp long which contained a typical gene set: 13 protein-coding genes (three subunits of cytochrome c oxidase, seven subunits of NADH dehydrogenase, two subunits of ATP synthetase, and cytochrome b), 12S and 16S ribosomal RNA genes, and 22 tRNA genes. All genes were located on one DNA strand. The assembled part of the control region, located between the tRNA-Arg and tRNA-His genes, was 727 bp long. The control region contained multiple hairpins, as well as tandem repeats of the AACGCTT monomer. Phylogenetic analysis based on the complete mitochondrial genomes indicated that the genus Dendrobaena occupied the basal position within Lumbricidae. D. tellermanica was a rather distant relative of the cosmopolitan D. octaedra, suggesting high genetic diversity in this genus. D. schmidti turned out to be paraphyletic with respect to D. tellermanica. Since D. schmidti is known to contain very high genetic diversity, these results may indicate that it may be split into several species.


Introduction
Earthworms are an important ecological group that accounts for the highest biomass among the soil fauna in many habitats (Hendrix et al., 2008). Its representatives process plant detritus to soil humus and return organic matter to the global cycles (Blouin et al., 2013). Earthworms also form soil structure, which has high impact on both soil fauna composition and vegetation (Lavelle et al., 2016). Therefore, this group defines ecosystem productivity in many respects.
Genetic diversity and phylogeny of earthworms remain insufficiently studied (Marchán et al., 2018). Currently, most works on earthworm genetic diversity are based on single mitochondrial and nuclear genes (Jamieson et al., 2002;Mar chán et al., 2022). Construction of multigene nuclear data sets is im peded by frequent polyploidy characteristic for this group (Vik torov, 1997;Vsevolodova-Perel, Bulatova, 2008;Mezhzhe rin et al., 2018), which makes it hard to detect suitable orthologs and amplify them by PCR.
Mitochondrial genomes are thus a promising tool for reconstruction of phylogenetic relationships in earthworms. A lot of mitochondrial genomes were sequenced and published in recent years (Zhang L. et al., 2014(Zhang L. et al., -2016aWang et al., 2015;Conrado et al., 2017;Hong et al., 2017;Shekhovtsov, Peltek, 2019;Zhang Q. et al., 2019;Liu et al., 2020;Seto et al., 2021;Csuzdi et al., 2022;Kim, Hong, 2022), and studies on phylogenetic relationships of certain groups were also conducted (Shekhovtsov et al., 2020a;Liu et al., 2021). However, almost all of these studies were made on species from West Europe and East Asia, with almost no representatives of other regions.
In this study, we performed sequencing, assembly, and analysis of the mitochondrial genome of Dendrobaena tel ler manica Perel, 1966. This species was earlier included into D. schmidti (Michaelsen, 1907), a polytypic species that was considered to contain multiple subspecies (Michaelsen, 1907;Kvavadze, 1985). D. tellermanica was believed to be a parthenogenetic form of D. schmidti (Perel, 1966). T.S. Vsevolodova-Perel (2003) demonstrated that many populations of D. tellermanica are amphimictic (sexual) and so isolated it into a separate species. D. tellermanica differs from D. schmidti by the lack of pigmentation, different position of the clitellum and the form of tuberculae pubertatis (Vsevolodova-Perel, 2003).
Currently, there is only one complete mitochondrial genome of the genus Dendrobaena in GenBank belonging to the cosmopolitan D. octaedra (Savigny, 1826). The mitochondrial genome of D. tellermanica will be the first sequenced mitochondrial genome of a Caucasian earthworm and will be important for studying the phylogeny of lumbricids.

Materials and methods
Specimens of D. tellermanica were collected in the Kara chay-Cherkess Republic (right bank of r. Uchkulan, road to the Chiper pass, 1483 a. s. l., 4-5 km from the Aktyube town, Al chemilla and Geranium meadow, N 43.410944, E 42.174538). Worms were fixed in ethanol. Morphological identification was performed according to the key of T.S. Vsevolodova-Perel (1997).
DNA was extracted using the standard phenol-chloroform method and sonicated on Covaris M220 to the target fragment length of 350 bp. The fragments were purified by 1.2 volume of AMPureXP (Beckman Coulter, USA) and quantified using fluorometry on a Qubit device. Genomic libraries were obtained from 100 ng of DNA using Roche KAPA Hyper Prep according to the manufacturer's protocol using KAPA UDI Adapter double barcodes. Quality and molarity of the obtained genomic library was assessed on a BA2100 bioanalyzer using the Agilent DNA High Sensitivity Kit and sequenced on an Illumina NextSeq550 with the Mid Output Kit v. 2.5 (300 Cycles) for 2×150 bp paired reads.

Results
We obtained 4.2 million paired reads for the D. tellermanica genomic library; 3.4 million remained after processing. Median coverage of genome contigs longer than 500 bp was estimated at 6 (average -20), median coverage for the mitochondrial contig was 20 (average -30). The assembled mitochondrial contig was 15,298 bp long and contained the typical set of genes: 13 protein-coding genes (three subunits of cytochrome oxidase, seven subunits of NADH dehydrogenase, two subunits of ATP synthase, and cytochrome b), 12S and 16S ribosomal RNA genes, and 22 tRNA genes. All genes were located on one DNA strand (Fig. 1). AT-con tent was 65.3 %. The leading strand contained 31.1 % А, 34.2 % Т, 13.9 % G, and 20.8 % C. The ND4 and ND4L genes overlapped by 7 bp. ATG was the only start codon used. Three protein coding genes (COIII, ND1 и ND2) had an abbreviated stop codon. Transport RNA genes were 60 to 69 bp long, their predicted structures are shown on Fig. 2.
The region between the tRNAArg and tRNAHis genes is known as the control region. A total of 727 bp were assigned to it. The control region could not be assembled, so the final sequence contained a gap. Its АТ-content (63.5 %) was close to the genome average, and its sequence contained multiple hairpins (see Fig. 2). It also included 11 tandem repeats of the AACGCTT monomer.

Organization of mitochondrial genome in earthworms
For a long time, the study of J.L. Boore and W.M. Brown (1995) on Lumbricus ter restris was the only description of an earthworm mitochondrial genome. It was 14,998 bp long and contained a set of genes usual for animal mitochondria. The hallmark of earthworm mitochondrial genomes, as well of Annelida as a whole, with few exceptions, is that all genes are located on a single strand (Weigert et al., 2016). All mitochondrial genes in earthworms are presumably expressed as a single transcript (Vallès, Boore, 2006). In this case, any inversions and most of the translocations will be non-viable, which leads to fixed gene positions in the genome. Indeed, mitochondrial gene order in annelids is highly conserved, and all Clitellata have identical gene order. We also failed to find any deviations from this rule.
While mtDNA gene order in earthworms is conserved, its sequence is highly variable, which is especially pronounced for the control region (also referred to as the D-loop). The control region acts as the replication origin, promotor, and the regulatory region for the mitochondrial gene expression (Clayton, 1992).
Organization of the control region varies in different earthworm taxa. Among the representatives of Glossoscolecidae and Megascolecidae they are short, usually less than 500 bp (Zhang L. et al., 2016a;Hong et al., 2017;Zhang Q. et al., 2019;Seto et al., 2021;Kim, Hong, 2022), while in two species of Dra wida (Moniligastridae) these sequences were completely absent (Liu et al., 2020). In Lumbricidae, the length of the control region varies from 400 bp in L. terres tris to 2000 bp in Eisenia fetida. For many species, control regions could not be amplified (Shekhovtsov et al., 2020a) or even recovered using NGS methods (Zhao et al., 2022). Here, we also failed to amplify the lacking part of the control region of D. tellermanica. This could be caused by its length or complex secondary structure: Fig. 2 demonstrates that almost all of the control region forms hairpins.

The phylogeny of earthworms based on mitochondrial genomes and the position of D. tellermanica
Phylogenetic analysis based on complete mitochondrial genomes (Fig. 3) suggests that Moniligastridae is distantly related to other earthworm families. Glossoscolecidae, represented here by a single species Pontoscolex corethru rus, occupied the basal position within the order Crassiclitellata. Megascolecidae and Lumbricidae, which were the most densely sampled, turned out as sister groups. Within the family Lumbricidae, which includes D. teller manica, the genus Dendrobaena was the sister group to all other genera of the family. There are only two known mitochondrial sequences for the genus Dendrobaena, the cosmopolitan D. octaedra and D. tellermanica obtained in this study. Expectedly, D. tellermanica forms a clade with D. octaedra, but they are rather distantly related.

149
We can conclude that this work is a first step in the study of the basal branches of Lumbricidae. Representatives of the genus Dendrobaena from the Caucasus are particularly interesting in this respect, because they account for a large part of its species diversity.
Earlier we performed a genetic analysis of morphological forms of D. schmidti (Shekhovtsov et al., 2020b), demonstrating that it represents at least two separate species. On the phylogenetic tree constructed using the COX1 gene (Fig. 4), D. tellermanica was inside one of the branches of D. schmid ti.
We should note that single mitochondrial genes, including COX1, are unsuitable for phylogenetic reconstruction on the family level (Klarica et al., 2012;Shekhovtsov et al., 2016Shekhovtsov et al., , 2020c, since they demonstrate poor resolution of the relationships between species and do not support the monophyly of most genera. COX1 is however of much use in the search for closely related species or genetic lineages. Moreover, there are thousands of COX1 sequences in the public databases and only a few mitochondrial genomes; e. g., mtDNA of D. schmid ti has not been sequenced yet. Therefore, the tree on Fig. 4 is given only to demonstrate the close relationship of D. tellermanica and D. schmidti.

Conclusion
The obtained preliminary results indicate that D. tellermanica could be treated as a subspecies of D. schmidti, as was believed earlier, or split D. schmidti into several species. The latter The mitochondrial genome of Dendrobaena tellermanica Perel, 1966 (Annelida: Lumbricidae)   option is supported by the high genetic and morphological variation within this species. However, such conclusions would require an analysis based on several loci, including nuclear ones.